An exact algorithm for semi-supervised minimum sum-of-squares clustering

Authors

Abstract

The minimum sum-of-squares clustering (MSSC), or k-means type clustering, is traditionally considered an unsupervised learning task. In recent years, the use of background knowledge to improve the cluster quality and promote interpretability of the clustering process has become a hot research topic at the intersection of mathematical optimization and machine learning research. The problem of taking advantage of background information in data clustering is called semi-supervised or constrained clustering. In this paper, we present a branch-and-cut algorithm for semi-supervised MSSC, where background knowledge is incorporated as pairwise must-link and cannot-link constraints. For the lower bound procedure, we solve the semidefinite programming relaxation of the MSSC discrete optimization model, and we use a cutting-plane procedure for strengthening the bound. For the upper bound, instead, by using integer programming tools, we use an adaptation of the k-means algorithm to the constrained case. For the first time, the proposed global optimization algorithm efficiently manages to solve real-world instances up to 800 data points with different combinations of must-link and cannot-link constraints and with a generic number of features. This problem size is about four times larger than the one of the instances solved by state-of-the-art exact algorithms.
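To make the problem statement concrete, the following is a minimal Python sketch of semi-supervised MSSC on a toy instance. It is not the branch-and-cut algorithm of the paper: it simply enumerates every assignment of points to clusters, discards those that violate a must-link or cannot-link constraint, and keeps the one with the smallest sum of squared distances to the cluster centroids. The function names (`sum_of_squares`, `feasible`, `brute_force_constrained_mssc`) and the requirement that all k clusters be non-empty are assumptions made for illustration, and the approach is only usable for a handful of points.

```python
from itertools import product


def sum_of_squares(points, labels, k):
    """MSSC objective: total squared distance of each point to its cluster centroid."""
    total = 0.0
    for c in range(k):
        members = [p for p, lab in zip(points, labels) if lab == c]
        if not members:
            continue
        dim = len(members[0])
        centroid = [sum(p[d] for p in members) / len(members) for d in range(dim)]
        total += sum((p[d] - centroid[d]) ** 2 for p in members for d in range(dim))
    return total


def feasible(labels, must_link, cannot_link):
    """Check the pairwise constraints, given as (i, j) index pairs."""
    same = all(labels[i] == labels[j] for i, j in must_link)
    different = all(labels[i] != labels[j] for i, j in cannot_link)
    return same and different


def brute_force_constrained_mssc(points, k, must_link=(), cannot_link=()):
    """Exhaustive search over all k^n assignments (illustrative only)."""
    best_labels, best_cost = None, float("inf")
    for labels in product(range(k), repeat=len(points)):
        if len(set(labels)) != k:  # assumption: all k clusters must be non-empty
            continue
        if not feasible(labels, must_link, cannot_link):
            continue
        cost = sum_of_squares(points, labels, k)
        if cost < best_cost:
            best_labels, best_cost = labels, cost
    return best_labels, best_cost


if __name__ == "__main__":
    pts = [(0.0, 0.0), (0.0, 1.0), (5.0, 0.0), (5.0, 1.0)]
    print(brute_force_constrained_mssc(pts, 2))
    print(brute_force_constrained_mssc(pts, 2, cannot_link=[(0, 1)]))
```

On this toy data the unconstrained optimum groups the two left points and the two right points, for an objective value of 1.0. Adding a cannot-link constraint between the two left points forces a worse partition, which illustrates how background knowledge changes the feasible set and therefore the optimal value.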

Related sources

An Incremental DC Algorithm for the Minimum Sum-of-Squares Clustering

Here, an algorithm is presented for solving minimum sum-of-squares clustering problems using their difference of convex (DC) representations. The proposed algorithm is based on an incremental approach and applies the well-known DC algorithm at each iteration. It is tested and compared with other clustering algorithms using large real-world data sets.

An Incremental DC Algorithm for the Minimum Sum-of-Squares Clustering

Clustering is an unsupervised technique dealing with problems of organizing a collection of patterns into clusters based on similarity. Most clustering algorithms are based on hierarchical and partitional approaches. Algorithms based on a hierarchical approach generate a dendrogram representing the nested grouping of patterns and the similarity levels at which groupings change [19]. Partitional cl...

An improved column generation algorithm for minimum sum-of-squares clustering

Given a set of entities associated with points in Euclidean space, minimum sum-of-squares clustering (MSSC) consists in partitioning this set into clusters such that the sum of squared distances from each point to the centroid of its cluster is minimized. A column generation algorithm for MSSC was given in du Merle et al. [15]. The bottleneck of that algorithm is the resolution of the auxiliary prob...

An Effective Semi-supervised Divisive Clustering Algorithm

Nowadays, data are generated massively and rapidly from scientific fields such as bioinformatics, neuroscience and astronomy to business and engineering fields. Cluster analysis, as one of the major data analysis tools, is therefore more significant than ever. Here, we propose an effective Semi-supervised Divisive Clustering algorithm (SDC). Data points are first organized by a minimal spanning...

An Improved Semi-supervised Fuzzy Clustering Algorithm

Semi-supervised clustering is an important method which can improve clustering performance by introducing partial supervised information. This paper mainly studies the semi-supervised fuzzy clustering based on Mahalanobis distance and Gaussian Kernel for SCAPC algorithm. Here, we give a new semi-supervised fuzzy clustering objective function. By solving the optimization problem with above objec...

Journal

Journal title: Computers & Operations Research

Year: 2022

ISSN: 0305-0548, 1873-765X

DOI: https://doi.org/10.1016/j.cor.2022.105958